Paper Weekly | Latest Research Progress in Recommender Systems

ML_RSer, 机器学习与推荐算法 (Machine Learning and Recommendation Algorithms), 2022-12-14
This issue selects 17 recommender-system papers newly released last week (07/25 to 07/31).

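The topics covered in this issue mainly include content-based recommendation [2,5], sequential recommendation [3], graph-neural-network-based recommendation [3,8,15], bundle recommendation [4], contrastive-learning-based recommendation [2,6], interactive recommendation [4,6], recommendation that accounts for temporal bias [7], robust recommendation [9], prompt-learning-enhanced collaborative filtering [11], federated recommendation [12], a survey of trustworthy recommendation [14] (350 references summarizing frontier progress in trustworthy recommender systems), and security issues in sequential recommendation [16].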

The paper titles and abstracts are collected below; if a paper interests you, follow its link to read the original in depth.

  • 1. Gender In Gender Out: A Closer Look at User Attributes in Context-Aware Recommendation

  • 2. Exploiting Negative Preference in Content-based Music Recommendation with Contrastive Learning

  • 3. Factorial User Modeling with Hierarchical Graph Neural Network for Enhanced Sequential Recommendation

  • 4. Bundle MCR: Towards Conversational Bundle Recommendation

  • 5. Personality-Driven Social Multimedia Content Recommendation

  • 6. Contrastive Learning for Interactive Recommendation in Fashion

  • 7. Exploring the Impact of Temporal Bias in Point-of-Interest Recommendation

  • 8. Layer-refined Graph Convolutional Networks for Recommendation

  • 9. Multiple Robust Learning for Recommendation

  • 10. The trade-offs of model size in large recommendation models: A 10000x compressed criteo-tb DLRM model (100 GB parameters to mere 10MB)

  • 11. Enhancing Collaborative Filtering Recommender with Prompt-Based Sentiment Analysis

  • 12. ReFRS: Resource-efficient Federated Recommender System for Dynamic and Diversified User Preferences

  • 13. JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System

  • 14. A Survey on Trustworthy Recommender Systems

  • 15. Benchmarking GNN-Based Recommender Systems on Intel Optane Persistent Memory

  • 16. Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders

  • 17. A Transferable Recommender Approach for Selecting the Best Density Functional Approximations in Chemical Discovery

1. Gender In Gender Out: A Closer Look at User Attributes in Context-Aware Recommendation

Manel Slokom, Özlem Özgöbek, Martha Larson

https://arxiv.org/abs/2207.14218

This paper studies user attributes in light of current concerns in the recommender system community: diversity, coverage, calibration, and data minimization. In experiments with a conventional context-aware recommender system that leverages side information, we show that user attributes do not always improve recommendation. Then, we demonstrate that user attributes can negatively impact diversity and coverage. Finally, we investigate the amount of information about users that 'survives' from the training data into the recommendation lists produced by the recommender. This information is a weak signal that could in the future be exploited for calibration or studied further as a privacy leak.

2. Exploiting Negative Preference in Content-based Music Recommendation with Contrastive Learning

Minju Park, Kyogu Lee

https://arxiv.org/abs/2207.13909

Advanced music recommendation systems are being introduced along with the development of machine learning. However, it is essential to design a music recommendation system that can increase user satisfaction by understanding users' music tastes, not by the complexity of models. Although several studies related to music recommendation systems exploiting negative preferences have shown performance improvements, there was a lack of explanation on how they led to better recommendations. In this work, we analyze the role of negative preference in users' music tastes by comparing music recommendation models with contrastive learning exploiting preference (CLEP) but with three different training strategies - exploiting preferences of both positive and negative (CLEP-PN), positive only (CLEP-P), and negative only (CLEP-N). We evaluate the effectiveness of the negative preference by validating each system with a small amount of personalized data obtained via survey and further illuminate the possibility of exploiting negative preference in music recommendations. Our experimental results show that CLEP-N outperforms the other two in accuracy and false positive rate. Furthermore, the proposed training strategies produced a consistent tendency regardless of different types of front-end musical feature extractors, proving the stability of the proposed method.

3. Factorial User Modeling with Hierarchical Graph Neural Network for Enhanced Sequential Recommendation

Lyuxin Xue, Deqing Yang, Yanghua Xiao

https://arxiv.org/abs/2207.13262

Most sequential recommendation (SR) systems employing graph neural networks (GNNs) only model a user's interaction sequence as a flat graph without hierarchy, overlooking diverse factors in the user's preference. Moreover, the timespan between interacted items is not sufficiently utilized by previous models, restricting SR performance gains. To address these problems, we propose a novel SR system employing a hierarchical graph neural network (HGNN) to model factorial user preferences. Specifically, a timespan-aware sequence graph (TSG) for the target user is first constructed with the timespan among interacted items. Next, all original nodes in TSG are softly clustered into factor nodes, each of which represents a certain factor of the user's preference. At last, all factor nodes' representations are used together to predict SR results. Our extensive experiments upon two datasets justify that our HGNN-based factorial user modeling obtains better SR performance than the state-of-the-art SR models.

4. Bundle MCR: Towards Conversational Bundle Recommendation

Zhankui He, Handong Zhao, Tong Yu, Sungchul Kim, Fan Du, Julian McAuley

https://arxiv.org/abs/2207.12628

Bundle recommender systems recommend sets of items (e.g., pants, shirt, and shoes) to users, but they often suffer from two issues: significant interaction sparsity and a large output space. In this work, we extend multi-round conversational recommendation (MCR) to alleviate these issues. MCR, which uses a conversational paradigm to elicit user interests by asking user preferences on tags (e.g., categories or attributes) and handling user feedback across multiple rounds, is an emerging recommendation setting to acquire user feedback and narrow down the output space, but has not been explored in the context of bundle recommendation. In this work, we propose a novel recommendation task named Bundle MCR. We first propose a new framework to formulate Bundle MCR as Markov Decision Processes (MDPs) with multiple agents, for user modeling, consultation and feedback handling in bundle contexts. Under this framework, we propose a model architecture, called Bundle Bert (Bunt) to (1) recommend items, (2) post questions and (3) manage conversations based on bundle-aware conversation states. Moreover, to train Bunt effectively, we propose a two-stage training strategy. In an offline pre-training stage, Bunt is trained using multiple cloze tasks to mimic bundle interactions in conversations. Then in an online fine-tuning stage, Bunt agents are enhanced by user interactions. Our experiments on multiple offline datasets as well as the human evaluation show the value of extending MCR frameworks to bundle settings and the effectiveness of our Bunt design.

5. Personality-Driven Social Multimedia Content Recommendation

Qi Yang, Sergey Nikolenko, Alfred Huang, Aleksandr Farseev

https://arxiv.org/abs/2207.12236

Social media marketing plays a vital role in promoting brand and product values to wide audiences. In order to boost their advertising revenues, global media buying platforms such as Facebook Ads constantly reduce the reach of branded organic posts, pushing brands to spend more on paid media ads. In order to run organic and paid social media marketing efficiently, it is necessary to understand the audience, tailoring the content to fit their interests and online behaviours, which is impossible to do manually at a large scale. At the same time, various personality type categorization schemes such as the Myers-Briggs Personality Type indicator make it possible to reveal the dependencies between personality traits and user content preferences on a wider scale by categorizing audience behaviours in a unified and structured manner. This problem is yet to be studied in depth by the research community, while the level of impact of different personality traits on content recommendation accuracy has not been widely utilised and comprehensively evaluated so far. Specifically, in this work we investigate the impact of human personality traits on the content recommendation model by applying a novel personality-driven multi-view content recommender system called Personality Content Marketing Recommender Engine, or PersiC. Our experimental results and real-world case study demonstrate not just PersiC's ability to perform efficient human personality-driven multi-view content recommendation, but also allow for actionable digital ad strategy recommendations, which when deployed are able to improve digital advertising efficiency by over 420% as compared to the original human-guided approach.

6. Contrastive Learning for Interactive Recommendation in Fashion

Karin Sevegnani, Arjun Seshadri, Tian Wang, Anurag Beniwal, Julian McAuley, Alan Lu, Gerard Medioni

https://arxiv.org/abs/2207.12033

Recommender systems and search are both indispensable in facilitating personalization and ease of browsing in online fashion platforms. However, the two tools often operate independently, failing to combine the strengths of recommender systems to accurately capture user tastes with search systems' ability to process user queries. We propose a novel remedy to this problem by automatically recommending personalized fashion items based on a user-provided text request. Our proposed model, WhisperLite, uses contrastive learning to capture user intent from natural language text and improves the recommendation quality of fashion products. WhisperLite combines the strength of CLIP embeddings with additional neural network layers for personalization, and is trained using a composite loss function based on binary cross entropy and contrastive loss. The model demonstrates a significant improvement in offline recommendation retrieval metrics when tested on a real-world dataset collected from an online retail fashion store, as well as widely used open-source datasets in different e-commerce domains, such as restaurants, movies and TV shows, clothing and shoe reviews. We additionally conduct a user study that captures user judgements on the relevance of the model's recommended items, confirming the relevancy of WhisperLite's recommendations in an online setting.

7. Exploring the Impact of Temporal Bias in Point-of-Interest Recommendation

Hossein A. Rahmani, Mohammadmehdi Naghiaei, Ali Tourani, Yashar Deldjoo

https://arxiv.org/abs/2207.11609

Recommending appropriate travel destinations to consumers based on contextual information such as their check-in time and location is a primary objective of Point-of-Interest (POI) recommender systems. However, the issue of contextual bias (i.e., how much consumers prefer one situation over another) has received little attention from the research community. This paper examines the effect of temporal bias, defined as the difference between users' check-in hours, leisure vs. work hours, on the consumer-side fairness of context-aware recommendation algorithms. We believe that eliminating this type of temporal (and geographical) bias might contribute to a drop in traffic-related air pollution, noting that rush-hour traffic may be more congested. To surface effective POI recommendations, we evaluated the sensitivity of state-of-the-art context-aware models to the temporal bias contained in users' check-in activities on two POI datasets, namely Gowalla and Yelp. The findings show that the examined context-aware recommendation models prefer one group of users over another based on the time of check-in and that this preference persists even when users have the same amount of interactions.

8. Layer-refined Graph Convolutional Networks for Recommendation

Xin Zhou, Donghui Lin, Yong Liu, Chunyan Miao

https://arxiv.org/abs/2207.11088

Recommendation models utilizing Graph Convolutional Networks (GCNs) have achieved state-of-the-art performance, as they can integrate both the node information and the topological structure of the user-item interaction graph. However, these GCN-based recommendation models not only suffer from over-smoothing when stacking too many layers but also bear performance degeneration resulting from the existence of noise in user-item interactions. In this paper, we first identify a recommendation dilemma of over-smoothing and solution collapsing in current GCN-based models. Specifically, these models usually aggregate all layer embeddings for node updating and achieve their best recommendation performance within a few layers because of over-smoothing. Conversely, if we place learnable weights on layer embeddings for node updating, the weight space will always collapse to a fixed point, at which the weighting of the ego layer almost holds all. We propose a layer-refined GCN model, dubbed LayerGCN, that refines layer representations during information propagation and node updating of GCN. Moreover, previous GCN-based recommendation models aggregate all incoming information from neighbors without distinguishing the noise nodes, which deteriorates the recommendation performance. Our model further prunes the edges of the user-item interaction graph following a degree-sensitive probability instead of the uniform distribution. Experimental results show that the proposed model outperforms the state-of-the-art models significantly on four public datasets with fast training convergence. The implementation code of the proposed method is available at

9. Multiple Robust Learning for Recommendation

Haoxuan Li, Quanyu Dai, Yuru Li, Yan Lyu, Zhenhua Dong, Peng Wu, Xiao-Hua Zhou

https://arxiv.org/abs/2207.10796

In recommender systems, a common problem is the presence of various biases in the collected data, which deteriorates the generalization ability of the recommendation models and leads to inaccurate predictions. Doubly robust (DR) learning has been studied in many tasks in RS, with the advantage that unbiased learning can be achieved when either a single imputation or a single propensity model is accurate. In this paper, we propose a multiple robust (MR) estimator that can take the advantage of multiple candidate imputation and propensity models to achieve unbiasedness. Specifically, the MR estimator is unbiased when any of the imputation or propensity models, or a linear combination of these models is accurate. Theoretical analysis shows that the proposed MR is an enhanced version of DR when only having a single imputation and propensity model, and has a smaller bias. Inspired by the generalization error bound of MR, we further propose a novel multiple robust learning approach with stabilization. We conduct extensive experiments on real-world and semi-synthetic datasets, which demonstrates the superiority of the proposed approach over state-of-the-art methods.
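To make the doubly robust (DR) idea that MR builds on concrete, here is a minimal Python sketch of the standard DR error estimator from the debiased-recommendation literature. It is not the paper's MR estimator. The function names `dr_estimate` and `expected_dr` and the toy numbers are illustrative assumptions.

```python
from itertools import product


def dr_estimate(errors, imputed, observed, propensities):
    """Doubly robust estimate of the average prediction error.

    errors[i]       true error of entry i (only used where observed[i] == 1)
    imputed[i]      error guessed by the imputation model
    observed[i]     1 if the rating of entry i was observed, else 0
    propensities[i] estimated probability that entry i is observed
    """
    total = 0.0
    for e, e_hat, o, p in zip(errors, imputed, observed, propensities):
        # Start from the imputed error; correct it only on observed entries.
        correction = (e - e_hat) / p if o else 0.0
        total += e_hat + correction
    return total / len(errors)


def expected_dr(errors, imputed, true_p, est_p):
    """Exact expectation of dr_estimate over all observation patterns,
    where entry i is observed independently with probability true_p[i]."""
    expectation = 0.0
    for pattern in product([0, 1], repeat=len(errors)):
        prob = 1.0
        for o, p in zip(pattern, true_p):
            prob *= p if o else 1.0 - p
        expectation += prob * dr_estimate(errors, imputed, pattern, est_p)
    return expectation


errors = [1.0, 2.0, 3.0, 4.0]   # true mean error is 2.5
true_p = [0.5, 0.25, 0.8, 0.4]  # true observation probabilities
wrong_p = [0.9, 0.9, 0.9, 0.9]  # a deliberately wrong propensity model
zeros = [0.0, 0.0, 0.0, 0.0]    # a deliberately wrong imputation model

print(expected_dr(errors, errors, true_p, wrong_p))  # accurate imputation only
print(expected_dr(errors, zeros, true_p, true_p))    # accurate propensity only
print(expected_dr(errors, zeros, true_p, wrong_p))   # both models wrong
```

In this toy setting the first two expectations recover the true mean error, while the third does not. That is the "either model accurate" property the abstract describes for DR and extends to multiple candidate models for MR.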

10. The trade-offs of model size in large recommendation models: A 10000x compressed criteo-tb DLRM model (100 GB parameters to mere 10MB)

Aditya Desai, Anshumali Shrivastava

https://arxiv.org/abs/2207.10731

Embedding tables dominate industrial-scale recommendation model sizes, using up to terabytes of memory. A popular and the largest publicly available machine learning MLPerf benchmark on recommendation data is a Deep Learning Recommendation Model (DLRM) trained on a terabyte of click-through data. It contains 100GB of embedding memory (25+ billion parameters). DLRMs, due to their sheer size and the associated volume of data, face difficulty in training, deploying for inference, and memory bottlenecks due to large embedding tables. This paper analyzes and extensively evaluates a generic parameter sharing setup (PSS) for compressing DLRM models. We show theoretical upper bounds on the learnable memory requirements for achieving approximations to the embedding table. Our bounds indicate exponentially fewer parameters suffice for good accuracy. To this end, we demonstrate a PSS DLRM reaching 10000× compression on criteo-tb without losing quality. Such a compression, however, comes with a caveat. It requires 4.5× more iterations to reach the same saturation quality. The paper argues that this tradeoff needs more investigations as it might be significantly favorable. Leveraging the small size of the compressed model, we show a 4.3× improvement in training latency leading to similar overall training times. Thus, in the tradeoff between system advantage of a small DLRM model vs. slower convergence, we show that scales are tipped towards having a smaller DLRM model, leading to faster inference, easier deployment, and similar training times.
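One common way to realize parameter sharing is to hash every embedding entry into a single small shared array. The Python sketch below illustrates that general idea only. The class `SharedEmbedding`, the hash in `_slot`, and the table sizes are hypothetical and are not taken from the paper's PSS.

```python
import random


class SharedEmbedding:
    """Embedding table whose entries all live in one small shared array."""

    def __init__(self, shared_size, dim, seed=0):
        rng = random.Random(seed)
        # The only learnable parameters: a flat array of `shared_size` floats.
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(shared_size)]
        self.dim = dim

    def _slot(self, row, col):
        # Hypothetical deterministic hash from (row, column) to a shared slot.
        return (row * 2654435761 + col * 40503 + 12345) % len(self.weights)

    def lookup(self, row):
        # Assemble the row's embedding by reading `dim` hashed slots.
        return [self.weights[self._slot(row, c)] for c in range(self.dim)]


def compression_ratio(num_rows, dim, shared_size):
    """Size of the full table divided by the size of the shared array."""
    return num_rows * dim / shared_size


table = SharedEmbedding(shared_size=1600, dim=16)
vector = table.lookup(42)                          # 16 floats from the shared array
ratio = compression_ratio(1_000_000, 16, 1600)     # 16,000,000 / 1,600
print(len(vector), ratio)
```

Different rows inevitably collide in the shared array, which is the price of the compression. At the scale quoted in the abstract, a 10000× ratio turns 100 GB of embedding parameters into roughly 10 MB.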

11. Enhancing Collaborative Filtering Recommender with Prompt-Based Sentiment Analysis

Elliot Dang, Zheyuan Hu, Tong Li

https://arxiv.org/abs/2207.12883

Collaborative Filtering (CF) recommender is a crucial application in the online market and ecommerce. However, CF recommender has been proven to suffer from persistent problems related to sparsity of the user rating that will further lead to a cold-start issue. Existing methods address the data sparsity issue by applying token-level sentiment analysis that translates text reviews into sentiment scores as a complement of the user rating. In this paper, we attempt to optimize the sentiment analysis with advanced NLP models including BERT and RoBERTa, and experiment on whether the CF recommender has been further enhanced. We build the recommenders on the Amazon US Reviews dataset, and tune the pretrained BERT and RoBERTa with the traditional fine-tuned paradigm as well as the new prompt-based learning paradigm. Experimental results show that the recommender enhanced with the sentiment ratings predicted by the fine-tuned RoBERTa has the best performance, and achieved 30.7% overall gain by comparing MAP, NDCG and precision at K to the baseline recommender. The prompt-based learning paradigm, although superior to the traditional fine-tune paradigm in pure sentiment analysis, fails to further improve the CF recommender.
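For readers unfamiliar with the ranking metrics named in the abstract, the following Python sketch computes precision at K and NDCG at K under binary relevance. It uses the textbook definitions. The function names and the toy ranking are assumptions, and the paper's exact evaluation protocol may differ.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    top = ranked[:k]
    return sum(1 for item in top if item in relevant) / k


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    # A relevant item at 0-based position `pos` contributes 1 / log2(pos + 2).
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    # The ideal ranking places all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["a", "b", "c", "d"]   # hypothetical recommendation list
relevant = {"a", "c"}           # hypothetical ground-truth items
print(precision_at_k(ranked, relevant, 3))
print(ndcg_at_k(ranked, relevant, 3))
```

Precision at K ignores where in the top K a hit appears, whereas NDCG discounts hits that appear lower in the list, so the two can disagree about which recommender is better.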

12. ReFRS: Resource-efficient Federated Recommender System for Dynamic and Diversified User Preferences

Mubashir Imran, Hongzhi Yin, Tong Chen, Nguyen Quoc Viet Hung, Alexander Zhou, Kai Zheng

https://arxiv.org/abs/2207.13897

Owing to its nature of scalability and privacy by design, federated learning (FL) has received increasing interest in decentralized deep learning. FL has also facilitated recent research on upscaling and privatizing personalized recommendation services, using on-device data to learn recommender models locally. These models are then aggregated globally to obtain a more performant model, while maintaining data privacy. Typically, federated recommender systems (FRSs) do not consider the lack of resources and data availability at the end-devices. In addition, they assume that the interaction data between users and items is i.i.d. and stationary across end-devices, and that all local recommender models can be directly averaged without considering the user's behavioral diversity. However, in real scenarios, recommendations have to be made on end-devices with sparse interaction data and limited resources. Furthermore, users' preferences are heterogeneous and they frequently visit new items. This makes their personal preferences highly skewed, and the straightforwardly aggregated model is thus ill-posed for such non-i.i.d. data. In this paper, we propose Resource Efficient Federated Recommender System (ReFRS) to enable decentralized recommendation with dynamic and diversified user preferences. On the device side, ReFRS consists of a lightweight self-supervised local model built upon the variational autoencoder for learning a user's temporal preference from a sequence of interacted items. On the server side, ReFRS utilizes a semantic sampler to adaptively perform model aggregation within each identified user cluster. The clustering module operates in an asynchronous and dynamic manner to support efficient global model update and cope with shifting user interests. As a result, ReFRS achieves superior performance in terms of both accuracy and scalability, as demonstrated by comparative experiments.
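The contrast between averaging all local models and aggregating within user clusters can be sketched in a few lines of Python. This is a simplified illustration: plain per-cluster averaging stands in for the paper's semantic sampler, cluster assignments are assumed to be given, and the names `aggregate_by_cluster`, `local_models`, and `cluster_of` are hypothetical.

```python
from collections import defaultdict


def aggregate_by_cluster(local_models, cluster_of):
    """Average local parameter vectors separately inside each user cluster.

    local_models: dict mapping user id -> list of floats (model parameters)
    cluster_of:   dict mapping user id -> cluster id
    Returns a dict mapping cluster id -> averaged parameter vector.
    """
    grouped = defaultdict(list)
    for user, params in local_models.items():
        grouped[cluster_of[user]].append(params)
    return {
        # zip(*members) walks the members' vectors coordinate by coordinate.
        cluster: [sum(column) / len(column) for column in zip(*members)]
        for cluster, members in grouped.items()
    }


local_models = {"u1": [1.0, 2.0], "u2": [3.0, 4.0], "u3": [10.0, 20.0]}
cluster_of = {"u1": 0, "u2": 0, "u3": 1}
print(aggregate_by_cluster(local_models, cluster_of))
```

A single global average would pull the model of `u3` toward the other two users even though its parameters look very different. Keeping one aggregate per cluster avoids that for non-i.i.d. preferences.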

13. JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System

Xin Zhao (1), Zhiwei Fang (1), Yuchen Guo (2), Jie He (1), Wenlong Chen (1), Changping Peng (1) ((1) JD.com, (2) Tsinghua University)

https://arxiv.org/abs/2207.13311

A combinatorial recommender (CR) system feeds a list of items to a user at a time in the result page, in which the user behavior is affected by both contextual information and items. The CR is formulated as a combinatorial optimization problem with the objective of maximizing the recommendation reward of the whole list. Despite its importance, it is still a challenge to build a practical CR system, due to the efficiency, dynamics, personalization requirement in online environment. In particular, we tear the problem into two sub-problems, list generation and list evaluation. Novel and practical model architectures are designed for these sub-problems aiming at jointly optimizing effectiveness and efficiency. In order to adapt to online case, a bootstrap algorithm forming an actor-critic reinforcement framework is given to explore better recommendation mode in long-term user interaction. Offline and online experiment results demonstrate the efficacy of proposed JDRec framework. JDRec has been applied in online JD recommendation, improving click through rate by 2.6% and synthetical value for the platform by 5.03%. We will publish the large-scale dataset used in this study to contribute to the research community.

14. A Survey on Trustworthy Recommender Systems

Yingqiang Ge, Shuchang Liu, Zuohui Fu, Juntao Tan, Zelong Li, Shuyuan Xu, Yunqi Li, Yikun Xian, Yongfeng Zhang

https://arxiv.org/abs/2207.12515

Recommender systems (RS), serving at the forefront of Human-centered AI, are widely deployed in almost every corner of the web and facilitate the human decision-making process. However, despite their enormous capabilities and potential, RS may also lead to undesired counter-effects on users, items, producers, platforms, or even the society at large, such as compromised user trust due to non-transparency, unfair treatment of different consumers, or producers, privacy concerns due to extensive use of user's private data for personalization, just to name a few. All of these create an urgent need for Trustworthy Recommender Systems (TRS) so as to mitigate or avoid such adverse impacts and risks. In this survey, we will introduce techniques related to trustworthy and responsible recommendation, including but not limited to explainable recommendation, fairness in recommendation, privacy-aware recommendation, robustness in recommendation, user controllable recommendation, as well as the relationship between these different perspectives in terms of trustworthy and responsible recommendation. Through this survey, we hope to deliver readers with a comprehensive view of the research area and raise attention to the community about the importance, existing research achievements, and future research directions on trustworthy recommendation.

15. Benchmarking GNN-Based Recommender Systems on Intel Optane Persistent Memory

Yuwei Hu, Jiajie Li, Zhongming Yu, Zhiru Zhang

https://arxiv.org/abs/2207.11918

Graph neural networks (GNNs), which have emerged as an effective method for handling machine learning tasks on graphs, bring a new approach to building recommender systems, where the task of recommendation can be formulated as the link prediction problem on user-item bipartite graphs. Training GNN-based recommender systems (GNNRecSys) on large graphs incurs a large memory footprint, easily exceeding the DRAM capacity on a typical server. Existing solutions resort to distributed subgraph training, which is inefficient due to the high cost of dynamically constructing subgraphs and significant redundancy across subgraphs.

16. Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders

Zhenrui Yue, Huimin Zeng, Ziyi Kou, Lanyu Shang, Dong Wang

https://arxiv.org/abs/2207.11237

While sequential recommender systems achieve significant improvements on capturing user dynamics, we argue that sequential recommenders are vulnerable against substitution-based profile pollution attacks. To demonstrate our hypothesis, we propose a substitution-based adversarial attack algorithm, which modifies the input sequence by selecting certain vulnerable elements and substituting them with adversarial items. In both untargeted and targeted attack scenarios, we observe significant performance deterioration using the proposed profile pollution algorithm. Motivated by such observations, we design an efficient adversarial defense method called Dirichlet neighborhood sampling. Specifically, we sample item embeddings from a convex hull constructed by multi-hop neighbors to replace the original items in input sequences. During sampling, a Dirichlet distribution is used to approximate the probability distribution in the neighborhood such that the recommender learns to combat local perturbations. Additionally, we design an adversarial training method tailored for sequential recommender systems. In particular, we represent selected items with one-hot encodings and perform gradient ascent on the encodings to search for the worst case linear combination of item embeddings in training. As such, the embedding function learns robust item representations and the trained recommender is resistant to test-time adversarial examples. Extensive experiments show the effectiveness of both our attack and defense methods, which consistently outperform baselines by a significant margin across model architectures and datasets.
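The core step of Dirichlet neighborhood sampling, replacing an item embedding with a random convex combination of nearby embeddings, can be sketched as follows. This is a minimal illustration under assumptions: the original item is included in the hull, the concentration parameters are arbitrary, and the function names are hypothetical rather than taken from the paper's code.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_neighborhood_sample(item_vec, neighbor_vecs,
                                  alpha_self=1.0, alpha_neighbor=0.5, seed=0):
    """Return a point in the convex hull of an item and its neighbors.

    item_vec:      embedding of the original item (list of floats)
    neighbor_vecs: embeddings of its neighbors (list of lists of floats)
    The alpha values are assumed concentration parameters; a larger
    alpha_self keeps the sample closer to the original item on average.
    """
    rng = random.Random(seed)
    vectors = [item_vec] + list(neighbor_vecs)
    alphas = [alpha_self] + [alpha_neighbor] * len(neighbor_vecs)
    weights = sample_dirichlet(alphas, rng)  # non-negative, sums to 1
    mixed = [sum(w * vec[d] for w, vec in zip(weights, vectors))
             for d in range(len(item_vec))]
    return mixed, weights


item = [1.0, 0.0]
neighbors = [[0.0, 1.0], [0.5, 0.5]]
mixed, weights = dirichlet_neighborhood_sample(item, neighbors)
print(mixed, weights)
```

Because the weights are non-negative and sum to one, every sampled embedding stays inside the convex hull of the chosen vectors. Training on such samples exposes the recommender to local perturbations of each item.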

17. A Transferable Recommender Approach for Selecting the Best Density Functional Approximations in Chemical Discovery

Chenru Duan, Aditya Nandy, Ralf Meyer, Naveen Arunachalam, Heather J. Kulik

https://arxiv.org/abs/2207.10747

Approximate density functional theory (DFT) has become indispensable owing to its cost-accuracy trade-off in comparison to more computationally demanding but accurate correlated wavefunction theory. To date, however, no single density functional approximation (DFA) with universal accuracy has been identified, leading to uncertainty in the quality of data generated from DFT. With electron density fitting and transfer learning, we build a DFA recommender that selects the DFA with the lowest expected error with respect to gold standard but cost-prohibitive coupled cluster theory in a system-specific manner. We demonstrate this recommender approach on vertical spin-splitting energy evaluation for challenging transition metal complexes. Our recommender predicts top-performing DFAs and yields excellent accuracy (ca. 2 kcal/mol) for chemical discovery, outperforming both individual transfer learning models and the single best functional in a set of 48 DFAs. We demonstrate the transferability of the DFA recommender to experimentally synthesized compounds with distinct chemistry.
